ABSTRACT

This  paper  studies  how  to  identify  influential  observations  in  univariate 
ARIMA  time  series  models  and  how  to  measure  their  effects  on  the  estimated 
parameters  of  the  model.  The  sensitivity  of  the  parameters  to  the  presence  of 
either  additive  or  innovational  outliers  is  analyzed,  and  influence  statistics 
based  on  the  Mahalanobis  distance  are  presented.  The  statistic  linked  to 
additive outliers is shown to be very useful for indicating the robustness of the
fitted model to the given data set. Its application is illustrated using
simulation results and a relevant set of historical data.
SIGNIFICANCE AND EXPLANATION


Observed time series almost always have atypical points that are produced
by nonsystematic changes in the variables that are driving the series. As the
forecasts from any univariate time series model are based on the
extrapolation of the historical patterns, if the parameters of the series
depend heavily on a few atypical observations linked to non-repeatable
events, then the quality of these forecasts can be very poor. Furthermore,
the identification of these observations is very important in order to check
the robustness of the fitted model.

This  paper  studies  how  to  measure  the  influence  of  each  observation  on 
the  estimated  parameters  of  a  time  series  ARIMA  model.  The  effect  of  either 
additive  or  innovational  outliers  is  analyzed,  and  simple  expressions  are 
obtained  to  measure  their  effects.  A  statistic  is  introduced  that  seems  to  be 
very  useful  to  indicate  influential  observations  and  to  judge  the  general 
robustness  of  the  fitted  model. 


The  responsibility  for  the  wording  and  views  expressed  in  this  descriptive 
summary  lies  with  MRC,  and  not  with  the  author  of  this  report. 
1.  INTRODUCTION AND SUMMARY


Observed time series almost always have atypical points. These anomalous values are
produced by nonsystematic changes in the variables that are driving the series or affecting
them. As the forecasts from any time series model are based on the extrapolation of the
historical patterns, if the parameters of the series depend heavily on a few
atypical observations, resulting from isolated or non-repeatable events, then the quality
of the forecasts can be expected to be poor. Also, if the parameters of the model have a
physical or economic interpretation, the presence of undetected influential observations
can mislead the scientist about the properties of the model. Finally, the study of these
observations, that is, of the sensitivity of the model to the given data set, provides
meaningful information about the robustness of the fitted model.

This problem is related to, although different from, the study of outliers: it is well
known that an observation being an outlier does not imply that it substantially affects
the parameter estimates of the assumed model, although in general it will affect the
variance of the estimates.

Cook and Weisberg (1982) and Belsley, Kuh and Welsch (1980) present an overview of
influential observations in the regression model. This study has been extended to some
other members of the generalized linear model family (see Pregibon (1981)). Briefly, the
main idea of this approach is to delete suspicious observations and build a measure of the
change that this deletion produces in relevant features of the model, such as the estimated
parameter values or the forecasts.
The study of influential observations has been limited so far to independent data.

This  paper  attempts  to  extend  these  ideas  to  dependent  observations  in  the  context  of  time 
series  analysis  and  is  organized  as  follows.  Section  2  summarizes  the  literature  of 
outliers  in  time  series  models  and  discusses  the  two  basic  types  of  outliers  that  can  occur 
in  a  dynamic  situation.  Sections  3  and  4  show  how  to  build  measures  of  influence  for 
additive  outliers  and  for  innovational  outliers.  Section  5  presents  some  simulated 
examples  of  the  behavior  of  these  statistics  that  are  then  applied  in  section  6  to  the 
study  of  the  robustness  of  a  time  series  model. 

The main result of this paper is a statistic that can be easily computed
and seems to be very useful in indicating the observations that have strong influence on
the estimated parameter values. Thus, this statistic can be incorporated easily into
standard time series analysis practice and provides a quick and simple way to judge the
robustness of the fitted model. Second, a simple expression has been found that relates
the parameter values estimated with and without outliers in ARIMA models. This expression
is used to prove that additive outliers are expected to be much more influential than
innovational outliers. Third, on the assumption of innovational outliers the relevant
statistics for influential observations are identical to those derived in the regression
situation, but their usefulness seems to be small in the time series context.


2.  OUTLIERS IN TIME SERIES


Fox (1972) defines two types of outlier which may occur in time series data. The
first, called type I outlier by Fox, corresponds to a modification of the value of the
observed series due to some external cause, such as a gross recording error or an
intervention at some point. Assuming that the series $y_t$ follows an autoregressive moving
average model and that $z_t$ is the observed series, the model for a type I outlier is:

$$\phi(B)\, y_t = \theta(B)\, a_t$$

$$z_t = \begin{cases} y_t, & t \neq T \\ y_t + w, & t = T \end{cases}$$

where $B$ is the backshift operator, $B^k y_t = y_{t-k}$, and $\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p$ and
$\theta(B) = 1 - \theta_1 B - \dots - \theta_q B^q$ are the autoregressive and moving average polynomials. This model
could also be written as:

$$z_t = w\,\xi_t^{(T)} + \phi(B)^{-1}\theta(B)\, a_t, \qquad \xi_t^{(T)} = \begin{cases} 1, & t = T \\ 0, & t \neq T \end{cases} \qquad (2.1)$$


which points out that this model is a special case of intervention analysis (Box and Tiao
(1975)) with an instantaneous response function, $w$. Model (2.1) has been called the
additive outlier model by Denby and Martin (1979) and Chang and Tiao (1983), and the
aberrant observation model by Abraham and Box (1979).

The type I outlier can be interpreted as the effect on the series of some external
event or exogenous change in the system. On the other hand, the second type of outlier can
be considered as the effect of some internal change or endogenous effect. If we think of a
univariate time series model as an aggregate representation of the pattern of behavior of a
vector $x_t$ of explicative time series that are causing the observed series $z_t$, the noise
of the univariate model represents the aggregate of the nonsystematic variation of the
components of $x_t$. An exogenous intervention outlier in any of the components will produce
an anomalous value in the noise of the univariate process. The model will be

$$\phi(B)\, z_t = \theta(B)\bigl(a_t + w\,\xi_t^{(T)}\bigr) \qquad (2.2)$$

where $\xi_t^{(T)}$ is the indicator of time $T$ and the atypical behavior appears in the innovation.
This model has been called the innovational outlier model (Chang and Tiao (1983)) or the
aberrant innovation model (Abraham and Box (1979)).

Calling $\psi(B) = \phi(B)^{-1}\theta(B)$, both types of outliers can be written as

$$z_t = v(B)\, w\, \xi_t^{(T)} + \psi(B)\, a_t \qquad (2.3)$$

where $v(B) = 1 + v_1 B + \dots$. As shown by Chang and Tiao (1983), both types of outliers
are particular cases of the general intervention analysis model (2.3). The cases
$v(B) = 1$, additive outlier, and $v(B) = \psi(B)$, innovational outlier, are extreme cases
in this representation, and it is sensible to think of a third category of
time series outliers in which $v(B)$ is any dynamic transfer function response. The study
of this general class of outliers is still to be made.
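The difference between the two extreme cases of (2.3) can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes an AR(1) process started at zero, standard normal shocks, and hypothetical names (`simulate_ar1`, `ao_effect`, `io_effect`). It compares a contaminated series with the clean series generated from the same shocks, so that the difference is exactly the term $v(B)\,w\,\xi_t^{(T)}$.

```python
import random

def simulate_ar1(phi, n, seed, w=0.0, T=None, kind=None):
    """Simulate n values of an AR(1) series, optionally with one outlier.

    kind="additive":     only the observed value at index T is shifted by w,
                         i.e. v(B) = 1.
    kind="innovational": the shock at index T is shifted by w, so the effect
                         propagates through psi(B) = 1 / (1 - phi B).
    """
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if kind == "innovational":
        shocks[T] += w              # contaminate the innovation
    series, prev = [], 0.0
    for a in shocks:
        prev = phi * prev + a       # (1 - phi B) y_t = a_t
        series.append(prev)
    if kind == "additive":
        series[T] += w              # contaminate the observation only
    return series

phi, n, T, w = 0.7, 50, 32, 4.0     # index 32 is the 33rd observation
clean = simulate_ar1(phi, n, seed=1)
ao = simulate_ar1(phi, n, seed=1, w=w, T=T, kind="additive")
io = simulate_ar1(phi, n, seed=1, w=w, T=T, kind="innovational")

# Effect of each outlier: contaminated series minus clean series.
ao_effect = [a - c for a, c in zip(ao, clean)]
io_effect = [i - c for i, c in zip(io, clean)]
```

Under these assumptions `ao_effect` is zero everywhere except at index `T`, while `io_effect` decays as $w\phi^k$ for $k = 0, 1, \dots$ after `T`, which is the $\psi(B)$ response of the AR(1) model.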

Fox (1972) derived the maximum likelihood ratio test for both types of outliers for
autoregressive processes. Abraham and Box (1979) used the contaminated normal model as the
basic set-up to make inference in both types of model. Denby and Martin (1979) developed
generalized M-estimators for the first order autoregressive model and showed that, on the
one hand, no great loss of efficiency is expected in estimating the parameter by least
squares when there are innovational (type II) outliers but, on the other hand, if additive
outliers are present, the loss of efficiency suffered can be large. Alba and Zartman
(1980) have shown examples of robust estimation for ARIMA models, and Chernick, Downing and
Pike (1982) have studied the influence function for the autocorrelations of a stationary
time series.

Finally, Chang and Tiao (1983) extended Fox's results to general ARIMA models and
suggested a useful iterative procedure for outlier detection and parameter estimation. They
recommend computing the likelihood ratio statistics $\lambda_{1,T}$ and $\lambda_{2,T}$ to check if the
observation at time $T$ is either an innovational outlier ($\lambda_{1,T}$) or an additive outlier
($\lambda_{2,T}$). These statistics are given by


3.  A MEASURE OF INFLUENCE FOR ADDITIVE OUTLIERS


3.1.  The change in the parameter estimates


Suppose we have a stochastic process $y_t$ that follows a univariate ARIMA$(p, d, q)$ model.
It is assumed in what follows that $y_t$ represents deviations from some origin $\mu$ that
will be the mean if the series is stationary, and that the moving average part has a
characteristic equation with roots outside the unit circle so that the process is
invertible. Then, the process can be represented as

$$y_t = \sum_{i=1}^{h} \pi_i\, y_{t-i} + a_t$$

for some lag $h$. If the process is purely autoregressive, $h = p + d$; otherwise the $\pi$
coefficients are obtained from $\pi(B) = \phi(B)(1-B)^d\,\theta(B)^{-1}$ and, because of the invertibility of
$\theta(B)$, these coefficients will decrease and eventually become practically zero for some lag $h$.
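As an illustration of how the $\pi$ coefficients die out, the following sketch computes them for the simplest mixed case. It is an example under stated assumptions, not part of the original derivation: the model is a stationary ARMA(1,1) with $d = 0$, $(1 - \phi B) y_t = (1 - \theta B) a_t$, for which expanding $(1 - \phi B)(1 - \theta B)^{-1}$ gives $\pi_i = (\phi - \theta)\,\theta^{\,i-1}$. The function name `pi_weights` and the tolerance `tol` that defines "practically zero" are hypothetical choices.

```python
def pi_weights(phi, theta, tol=1e-6, max_lag=200):
    """pi weights of the ARMA(1,1) model (1 - phi B) y_t = (1 - theta B) a_t,
    written as y_t = sum_i pi_i y_{t-i} + a_t.

    The list is truncated at the first lag whose weight is below tol in
    absolute value, so its length plays the role of the lag h in the text.
    """
    weights = []
    pi_i = phi - theta              # pi_1
    while abs(pi_i) >= tol and len(weights) < max_lag:
        weights.append(pi_i)
        pi_i *= theta               # pi_{i+1} = theta * pi_i
    return weights

ar_only = pi_weights(0.7, 0.0)      # pure AR(1): a single weight, h = p = 1
mixed = pi_weights(0.5, 0.4)        # mixed model: geometric decay at rate theta
h = len(mixed)
```

For a pure autoregression the list has length $p$, in agreement with $h = p + d$; for the mixed model the length depends on how close $\theta$ is to the unit circle.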

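Let us now assume that an additive outlier happens at time $T$ and, instead of observing
$y_t$, we observe $z_t$, where $z_t = y_t$ $(t \neq T)$ and $z_T = y_T + w$. Then, as the Jacobian
of the transformation from $y_t$ to $z_t$ is one, the log-likelihood function for the observed
series $z_t$, conditional on the first $h$ values, is

$$\ell(\pi, \sigma^2, w) = -\frac{n-h}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{t \in S_1}(z_t - \pi' x_t)^2 - \frac{1}{2\sigma^2}\sum_{j=0}^{h}(z_{T+j} + \pi_j w - \pi' x_{T+j})^2$$

where $\sigma^2$ is the variance of the noise process $a_t$, $x_t' = (z_{t-1}, \dots, z_{t-h})$,
$\pi' = (\pi_1, \dots, \pi_h)$, and $\pi_0 = -1$. The set of indices $S_1$ is $(h+1, \dots, T-1,
T+h+1, \dots, n)$. The conditional maximum likelihood estimates of $\pi$ and $w$ are

$$\hat\pi = (X_y' X_y)^{-1} X_y' Y \qquad (3.1)$$

$$\hat w = z_T - \hat z_{T/n}$$

where $\hat z_{T/n}$ denotes the estimate of the observation at $T$ obtained from the rest of the
sample, and $X_y$ is the matrix of lagged values

$$X_y = \begin{pmatrix} y_h & y_{h-1} & \cdots & y_1 \\ \vdots & \vdots & & \vdots \\ y_{n-1} & y_{n-2} & \cdots & y_{n-h} \end{pmatrix}$$

built, like $Y$, with $z_T$ replaced by $\hat y_T = \hat z_{T/n}$. Then, using (3.3) and (3.4),

$$X_y' Y = X_z' Z - \hat w\, S_T$$

where $S_T' = (z_{T+1} + z_{T-1},\ z_{T+2} + z_{T-2},\ \dots,\ z_{T+h} + z_{T-h})$.

Expressing the estimated parameters $\hat\pi$ as a function of the original data,

$$(X_z' X_z - \hat w A_T)\,\hat\pi = X_z' Z - \hat w\, S_T$$

which leads to

$$\hat\pi_0 = \hat\pi + \hat w\,(X_z' X_z)^{-1}(S_T - A_T\hat\pi)$$

where $\hat\pi_0 = (X_z' X_z)^{-1} X_z' Z$ is the vector of parameters estimated assuming no outliers,
and calling $E_T$ the vector of "pseudo-residuals" given by

$$E_T = S_T - A_T\hat\pi$$

we obtain

$$\hat\pi_0 = \hat\pi + \hat w\,(X_z' X_z)^{-1} E_T. \qquad (3.6)$$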

To study how additive outliers influence the parameter estimates, we will express
$X_z' X_z$ and $E_T$ in (3.6) in terms of the uncontaminated process $y_t$. As

$$A_T = Z_2 + Z_2' - \hat w I$$

using that

$$Z_2 = Y_2 + \hat w I$$

then

$$A_T = Y_2 + Y_2' + \hat w I$$

and so

$$E_T = S_T - (Y_2 + Y_2')\hat\pi - \hat w\,\hat\pi.$$

Inserting this expression for $E_T$ in (3.6),

$$\hat\pi_0 = \bigl(I - \hat w^2 (X_z' X_z)^{-1}\bigr)\hat\pi + \hat w\,(X_z' X_z)^{-1}\bigl(S_T - (Y_2 + Y_2')\hat\pi\bigr).$$


If now $w \to \infty$, assuming a fixed sample size $n$, as

$$\lim w^{-2}(X_z' X_z) = \lim w^{-2}(X_y' X_y) + \lim w^{-1}(Y_2 + Y_2') + I = I$$

and, as $\hat w$ is a consistent estimator of $w$, when $w \to \infty$ the limit of
$\hat w^2 (X_z' X_z)^{-1}$ is also $I$. Also, $\hat w\,(X_z' X_z)^{-1} \to 0$. In practice this result means that
when $w$ is large, all the estimated coefficients $\hat\pi_0$ are pulled down towards zero, and
the series will appear to be white noise. For instance, in the AR(1) case

$$\hat\phi_0 = \Bigl(1 - \frac{\hat w^2}{\sum y_t^2 + \hat w^2 + 2\hat w y_T}\Bigr)\hat\phi + \frac{\hat w}{\sum y_t^2 + \hat w^2 + 2\hat w y_T}\,\bigl(y_{T+1} + y_{T-1} - 2 y_T \hat\phi\bigr)$$

and if $\hat w^2$ is large compared to $\sum y_t^2$ the value of $\hat\phi_0$ will be much smaller than $\hat\phi$.

The result that gross errors pull all the autocorrelation coefficients, and so the
estimated parameters, towards zero was noted by Treadway (1978) and Guttman and Tiao
(1978). An example of this problem with economic time series can be found in Pena and
Sanchez-Albornoz (1983).

This result is in agreement with the properties of the estimated parameter for a
first-order autoregressive process with an additive outlier given by Martin and Jong (1977)
and Denby and Martin (1979).
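The pull towards zero can be checked numerically for the AR(1) case. The sketch below is only an illustration under assumptions that are not in the paper: a simulated AR(1) series with $\phi = .7$ started at zero, a fixed seed, a hypothetical outlier position `T = 100`, and the ordinary least squares estimate $\hat\phi_0 = \sum z_t z_{t-1} / \sum z_{t-1}^2$ computed on the contaminated series for increasing outlier sizes.

```python
import random

def ls_ar1(z):
    """Least squares AR(1) estimate: sum z_t z_{t-1} / sum z_{t-1}^2."""
    num = sum(z[t] * z[t - 1] for t in range(1, len(z)))
    den = sum(z[t - 1] ** 2 for t in range(1, len(z)))
    return num / den

# Clean AR(1) series (phi = 0.7, standard normal shocks, started at zero).
rng = random.Random(0)
y, prev = [], 0.0
for _ in range(200):
    prev = 0.7 * prev + rng.gauss(0.0, 1.0)
    y.append(prev)

T = 100  # position of the additive outlier (0-based index)

def contaminated_estimate(w):
    """Estimate phi after adding an additive outlier of size w at index T."""
    z = list(y)
    z[T] += w
    return ls_ar1(z)

estimates = {w: contaminated_estimate(w) for w in (0.0, 5.0, 50.0, 1e6)}
```

Algebraically, the contaminated estimate equals $\bigl(\sum y_t y_{t-1} + w(y_{T+1} + y_{T-1})\bigr) / \bigl(\sum y_{t-1}^2 + 2 w y_T + w^2\bigr)$, so the numerator grows linearly in $w$ while the denominator grows quadratically, and the ratio tends to zero for a fixed sample size.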

3.2.  A statistic to measure influential outliers

A natural way to measure the influence of observation $z_T$ is to relate it to the
change in the parameter estimates when this observation is assumed to be an outlier. As
$\hat\pi_0$ and $\hat\pi$ are vectors, a useful way suggested by Cook (1977) is to measure the distance
between both vectors relative to a relevant positive semidefinite matrix $M$. A natural
selection is to choose $M$ as the variance covariance matrix of either of these two
estimated vectors and to build, then, a Mahalanobis distance. In order to have a common
ground to compare all the observations, it seems more useful to choose $M$ as the
covariance matrix of the parameters assuming no outliers (see Cook and Pena (1984)); then

$$D_2(T) = \frac{(\hat\pi_0 - \hat\pi)'(X_z' X_z)(\hat\pi_0 - \hat\pi)}{h\,\hat\sigma^2} \qquad (3.7)$$

where we have divided the distance by the dimension of the vectors involved, $h$, to have a
proper standardization.

The statistic (3.7) can also be interpreted as measuring the change in the vector of
one step ahead forecasts. Using the estimated parameters assuming no outliers this vector
is

$$\hat Z_0 = X_z \hat\pi_0$$

and using the parameters estimated assuming an additive outlier at $T$:

$$\hat Z = X_z \hat\pi.$$

The Euclidean distance between both vectors of forecasts is:

$$(\hat Z_0 - \hat Z)'(\hat Z_0 - \hat Z) = (\hat\pi_0 - \hat\pi)' X_z' X_z (\hat\pi_0 - \hat\pi)$$

and so $D_2(T)$ can also be interpreted as a standardized measure of the Euclidean distance
between the vectors of one step-ahead forecasts built with $\hat\pi_0$ and $\hat\pi$.

Using (3.6), the statistic can be written as

$$D_2(T) = \frac{\hat w^2}{h\,\hat\sigma^2}\; E_T'(X_z' X_z)^{-1} E_T.$$

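The problem in computing $D_2(T)$ is that the estimates $\hat w$ and $E_T$ require the
nonlinear estimation of the intervention. A solution first suggested by Fox (1972) is to
substitute for $\hat w$ another consistent estimator of $w$ that is easier to compute. He suggested
using the vector $\hat\pi_0$ instead of $\hat\pi$ to compute

$$\hat w_0 = z_T - \sum_{i=1}^{h} \tilde\pi_{0,i}\,(z_{T+i} + z_{T-i})$$

where

$$\tilde\pi_{0,i} = \Bigl(\hat\pi_{0,i} - \sum_{t=1}^{h-i} \hat\pi_{0,t}\,\hat\pi_{0,t+i}\Bigr) \Big/ \Bigl(\sum_{t=0}^{h} \hat\pi_{0,t}^2\Bigr)$$

and $\hat\pi_{0,j}$ is the $j$th component of $\hat\pi_0$. Fox (1972) verified using simulation that the
approximation was good for moderate sample sizes. In the same spirit we suggest using $\hat\pi_0$
instead of $\hat\pi$ to compute $E_T$. Calling

$$\hat E_T = S_T - \hat A_T\,\hat\pi_0$$

where $\hat A_T$ denotes $A_T$ evaluated at $\hat w_0$, the statistic we obtain is

$$D_2(T) = \frac{\hat w_0^2}{h\,\hat\sigma^2}\; \hat E_T'(X_z' X_z)^{-1} \hat E_T \qquad (3.8)$$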

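The likelihood ratio test to check for additive outliers is asymptotically equivalent
(see Chang and Tiao (1983)) to

$$\lambda_{2,T}^2 = \frac{\hat w_0^2}{\hat\sigma^2 \bigl(\sum_{i=0}^{h}\hat\pi_{0,i}^2\bigr)^{-1}}$$

and so $D_2$ can be written

$$D_2(T) = \frac{\lambda_{2,T}^2}{h \sum_{i=0}^{h}\hat\pi_{0,i}^2}\; \hat E_T'(X_z' X_z)^{-1}\hat E_T \qquad (3.9)$$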
Equation (3.9) shows the difference between detecting outliers and studying influence.
The influence of a particular value can be decomposed into two terms. The first,
$\lambda_{2,T}^2\, h^{-1} \bigl(\sum_{i=0}^{h}\hat\pi_{0,i}^2\bigr)^{-1}$, is mainly a function of the size of the outlier relative to the
model. The second, $\hat E_T'(X_z' X_z)^{-1}\hat E_T$, is a measure of the relative importance of the
observations in the series around the point at which the outlier happens. If we call, for
$i = 1, \dots, h$,

$$\hat e_{T+i} = z_{T+i} - \hat\pi_1 z_{T+i-1} - \dots - \hat\pi_h z_{T+i-h}$$

the residuals in the model estimated allowing for outliers, and

$$\hat e^{\,*}_{T-i} = z_{T-i} - \hat\pi_1 z_{T-i+1} - \dots - \hat\pi_h z_{T-i+h}$$

the backwards residuals, then it is straightforward to show, using the definition of $E_T$,
that

$$E_T = [\hat e_{T+1} + \hat e^{\,*}_{T-1},\ \dots,\ \hat e_{T+h} + \hat e^{\,*}_{T-h}]'.$$

So the quadratic form is taking into account that the importance of the
outlier in the parameter estimates depends on the previous and posterior $h$ observations.

This statistic should be computed as a routine diagnostic check for time series models
because, as we have shown, the presence of additive outliers can have a strong influence on
the estimated parameters of the model.
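To make the routine check concrete, the sketch below evaluates $D_2(T)$ of (3.8) at every interior point of a series for an AR(1) fit, where $h = 1$ and all matrices reduce to scalars. It is a minimal illustration resting on several assumptions that are not stated in the surviving text: $Z_2$ is taken to be $z_T$ in the AR(1) case, so that $\hat A_T = 2 z_T - \hat w_0$; $\hat\sigma^2$ is taken as the mean squared residual of the fit without outliers; and the function name `d2_ar1` is hypothetical.

```python
def d2_ar1(z):
    """Approximate D_2(T) of (3.8) for an AR(1) fit (h = 1).

    Returns a dict mapping each interior index T (0-based) to D_2(T).
    Assumptions of this sketch: Z_2 = z_T in the scalar case, and sigma^2
    is the mean squared residual of the no-outlier fit.
    """
    n = len(z)
    sxx = sum(z[t - 1] ** 2 for t in range(1, n))            # X_z' X_z
    phi0 = sum(z[t] * z[t - 1] for t in range(1, n)) / sxx   # estimate without outliers
    resid = [z[t] - phi0 * z[t - 1] for t in range(1, n)]
    sigma2 = sum(e * e for e in resid) / len(resid)          # residual variance
    phi_tilde = phi0 / (1.0 + phi0 ** 2)                     # two-sided interpolation weight
    out = {}
    for T in range(1, n - 1):
        s = z[T + 1] + z[T - 1]          # S_T
        w0 = z[T] - phi_tilde * s        # approximate outlier size at T
        a = 2.0 * z[T] - w0              # A_T = Z_2 + Z_2' - w0 (scalar here)
        e = s - a * phi0                 # pseudo-residual E_T
        out[T] = w0 ** 2 * e ** 2 / (sxx * sigma2)   # (3.8) with h = 1
    return out

d2 = d2_ar1([1.0, 2.0, 4.0, 2.0, 1.0])
```

In practice one would plot the returned values against `T` and inspect the largest ones, in the same way as the residual plots used in standard diagnostic checking.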


4.  A Measure of Influence for Innovational Outliers


4.1.  The effect on the parameter estimates

Suppose that the observed series $z_t$ follows the model

$$\pi(B)\, z_t = w(B)\,\xi_t^{(T)} + a_t \qquad (4.1)$$

where now $\pi(B) = \phi(B)(1-B)^d\,\theta(B)^{-1}$, and $w(B) = w_0 + w_1 B + \dots + w_{s-1}B^{s-1}$ represents a
general dynamic intervention at time $T$, with transfer function $v(B) = w(B)\pi(B)^{-1}$. The
case of an innovational outlier corresponds to making $w(B) = w_0$. We assume that the
process is invertible and so the $\pi$ coefficients will become practically zero for a
finite lag $h$. Then, the model can be written:

$$z_t = \sum_{i=1}^{h}\pi_i\, z_{t-i} + \sum_{j=0}^{s-1} w_j\,\xi_{t-j}^{(T)} + a_t$$

where $\xi_t^{(T)} = 0$ if $t \neq T$ and $\xi_T^{(T)} = 1$. We assume as before that the mean level of the
process $z_t$ has been removed. Then, the least squares estimates of the parameters, which
are equal to the conditional maximum likelihood estimators, are:

$$\begin{pmatrix}\hat\pi \\ \hat w\end{pmatrix} = \begin{pmatrix} X_z' X_z & X_z' D \\ D' X_z & D' D\end{pmatrix}^{-1}\begin{pmatrix} X_z' Z \\ D' Z\end{pmatrix}$$

where $X_z$ and $Z$ were defined in section 3 and

$$D' = \bigl(0_{s\times(T-h)},\ I_{s\times s},\ 0_{s\times(n-h-T)}\bigr)$$

and so $D'D = I$. Using now the expression for the inverse of a partitioned matrix:

$$\begin{pmatrix} X_z'X_z & X_z'D \\ D'X_z & D'D\end{pmatrix}^{-1} = \begin{pmatrix}(X_z'X_z)^{-1} & 0 \\ 0 & 0\end{pmatrix} + \begin{pmatrix} -(X_z'X_z)^{-1}X_z'D \\ I\end{pmatrix}(I - D'HD)^{-1}\begin{pmatrix} -D'X_z(X_z'X_z)^{-1} & I\end{pmatrix}$$

where $H$ is the idempotent matrix $X_z(X_z'X_z)^{-1}X_z'$. Then, after some straightforward
algebra:

$$\hat\pi_0 = \hat\pi + (X_z'X_z)^{-1}X_{T,s}'\,\hat w \qquad (4.2)$$

$$\hat w = (I - D'HD)^{-1}\bigl[Z_{T,s} - X_{T,s}\,\hat\pi_0\bigr] \qquad (4.3)$$

where $\hat\pi_0 = (X_z'X_z)^{-1}X_z'Z$ is, as in section 3, the vector of estimated parameters assuming
no outliers and:

$$X_{T,s} = \begin{pmatrix} z_{T-1} & \cdots & z_{T-h} \\ \vdots & & \vdots \\ z_{T+s-2} & \cdots & z_{T-h+s-1}\end{pmatrix}, \qquad Z_{T,s} = \begin{pmatrix} z_T \\ \vdots \\ z_{T+s-1}\end{pmatrix}.$$

Let us call

$$e_{T,s} = Z_{T,s} - X_{T,s}\,\hat\pi_0$$

the vector of estimated residuals for the relevant observations assuming no outliers, and
let $H_{T,s} = D'HD$ be the square symmetric matrix of dimension $s$ that contains the
distances from the vectors $(z_{t-1}, \dots, z_{t-h})$ to the origin; then

$$\hat w = (I - H_{T,s})^{-1} e_{T,s}. \qquad (4.4)$$

In the particular case of an innovational outlier, $w(B) = w$ and (4.2) and (4.4)
reduce to

$$\hat\pi_0 = \hat\pi + \hat w\,(X_z'X_z)^{-1} x_T \qquad (4.5)$$

$$\hat w = (1 - d_T)^{-1} e_{T,0} \qquad (4.6)$$

where $e_{T,0} = z_T - x_T'\hat\pi_0$ is the residual at point $T$ from the model without intervention,
and $d_T = x_T'(X_z'X_z)^{-1}x_T$ is the distance from the vector $x_T' = (z_{T-1}, \dots, z_{T-h})$ to the
origin.

As an example, consider the AR(1) case. Calling $\phi$ the parameter, (4.5) reduces to

$$\hat\phi_0 = \hat\phi + \frac{\hat w\, z_{T-1}}{\sum z_{t-1}^2}$$

and $\hat\phi_0$ can be greater or smaller than $\hat\phi$ depending on the sign of $\hat w z_{T-1}$. It is clear
that when $n \to \infty$, $\hat\phi_0 \to \hat\phi$, and so the least squares estimator $\hat\phi_0$ is a consistent
estimator of $\phi$.

The fact that under innovational outliers the least squares estimators of the
parameters of an autoregressive process are consistent was first obtained by Mann and Wald
(1943). Martin and Jong (1977) have studied the efficiency of the estimators in the
first-order autoregressive process and show that, although consistent, the estimator can be
quite inefficient.

These results explain the well known flexibility and adaptability of univariate ARMA
models for forecasting purposes. As we noticed earlier, this type of outlier can be
interpreted as a sudden change in one or more of the unknown series that are driving the
observed series $z_t$. It can be concluded that ARMA models are fairly robust to this kind
of internal perturbation.


4.2.  Building a measure of influence for innovational outliers


We consider again model (4.1), which has a general dynamic outlier with transfer
function $v(B) = w(B)\pi(B)^{-1}$. Then, a measure of influence can be built as before using

$$D(T, w) = (\hat\pi_0 - \hat\pi)'\, M\, (\hat\pi_0 - \hat\pi)$$

where $M$ is a positive definite matrix that defines the metric. Choosing, as in section
3.2, $M = (X_z'X_z)/(h\,\hat\sigma^2)$ and using (4.2) and (4.4), then

$$D(T, w) = \frac{e_{T,s}'\,(I - H_{T,s})^{-1} H_{T,s}\, (I - H_{T,s})^{-1} e_{T,s}}{h\,\hat\sigma^2} \qquad (4.7)$$

where we have used that:

$$H_{T,s} = X_{T,s}(X_z'X_z)^{-1}X_{T,s}'.$$
The statistic $D(T, w)$ is similar to the one proposed by Cook and Weisberg (1982) to
analyze the influence of a set of points on the parameter estimates of the regression
model. The similarity is clear because of the linear structure of equation (4.1). If we
now assume that $w(B) = w$, and so $w\,\xi_t^{(T)}$ is an innovational outlier, using (4.5) and
(4.6):

$$D_1(T) = \frac{1}{h}\;\frac{e_T^2}{\hat\sigma^2(1 - d_T)}\;\frac{d_T}{1 - d_T} \qquad (4.8)$$

which is identical to the statistic suggested by Cook (1977) to measure the effect of an
observation on the parameters of a regression model. The statistic can be interpreted as
the product of two terms: the first, $e_T^2\,\hat\sigma^{-2}(1 - d_T)^{-1}$, is the squared standardized residual at the
point of the intervention, and the second, $d_T(1 - d_T)^{-1}$, represents the distance of $x_T$ to
the origin but with relation to a metric built without taking into account $x_T$. $D_1(T)$ can
also be expressed as a function of the likelihood criterion $\lambda_{1,T}$ advocated by Fox
(1972) and Chang and Tiao (1983) to test for innovational outliers:

$$D_1(T) = \frac{\lambda_{1,T}^2}{h}\;\frac{d_T}{(1 - d_T)^2}$$

Note that the influence of the outlier observation depends now only on the relative
values of the $h$ observations before the intervention, as measured by $d_T$, and not on
the $h$ posterior observations, in contrast with what happened in the additive outlier
case. In the next section we will compare both statistics.
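The relations (4.5), (4.6) and (4.8) can be verified numerically in the AR(1) case, where $h = 1$, $x_T = z_{T-1}$ and $d_T = z_{T-1}^2 / \sum z_{t-1}^2$. The sketch below is an illustration under assumptions: the function name `innovational_fit_ar1` is hypothetical, and $\hat\sigma^2$ is taken as the mean squared residual of the model without intervention. It uses the standard fact that fitting a dummy variable at $T$ is the same as estimating $\phi$ from the remaining rows.

```python
def innovational_fit_ar1(z, T):
    """AR(1) least squares fit with and without a dummy at index T (T >= 1).

    Returns (phi0, phi, w, d, e0, d1) where phi0 is the estimate without
    outliers, phi and w are the estimates of the model with the dummy,
    d is d_T, e0 is the residual at T without the dummy, and d1 is D_1(T).
    """
    n = len(z)
    sxx = sum(z[t - 1] ** 2 for t in range(1, n))
    sxy = sum(z[t] * z[t - 1] for t in range(1, n))
    phi0 = sxy / sxx                               # no-outlier estimate
    # With a dummy at T the row for time T is fitted exactly,
    # so phi is estimated from all the other rows.
    phi = (sxy - z[T] * z[T - 1]) / (sxx - z[T - 1] ** 2)
    w = z[T] - phi * z[T - 1]                      # estimated outlier size
    d = z[T - 1] ** 2 / sxx                        # d_T
    e0 = z[T] - phi0 * z[T - 1]                    # residual without dummy
    resid = [z[t] - phi0 * z[t - 1] for t in range(1, n)]
    sigma2 = sum(r * r for r in resid) / len(resid)
    d1 = e0 ** 2 * d / (sigma2 * (1.0 - d) ** 2)   # (4.8) with h = 1
    return phi0, phi, w, d, e0, d1

z = [1.0, 2.0, 4.0, 2.0, 1.0]
phi0, phi, w, d, e0, d1 = innovational_fit_ar1(z, T=2)
```

For this small series the estimated outlier size agrees with $e_{T,0}/(1 - d_T)$ and the two parameter estimates satisfy $\hat\phi_0 = \hat\phi + \hat w z_{T-1} / \sum z_{t-1}^2$, as (4.6) and (4.5) require.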
5.  Comparison of both statistics
5.1.  Deleting versus forecasting

There are two basic theoretical reasons to recommend using the influence measure based
on additive outliers (3.8) instead of the one derived for innovational outliers (4.8) as a
routine checking device in univariate time series models. The first is that additive
outliers are expected to be much more influential than innovational outliers. The second
is that this measure provides a more reasonable generalization to dynamic problems of the
measures derived in the context of independent observations.

The  main  justification  for  the  influence  measures  suggested  for  the  linear  model  is 
that  these  measures  are  standardized  versions  of  a  finite  approximation  to  the  influence 
curve  introduced  by  Hampel  (1974).  (See  for  instance  Cook  and  Weisberg  (1982)  and  Welsch 
(1982)).  However,  all  this  theory  relies  on  a  sample  of  independent  and  identically 
distributed  observations,  which  is  obviously  inadequate  to  deal  with  stochastic  processes 
and,  in  particular,  with  time  series. 

In the regression model, for instance, the empirical influence curve for one
observation can be expressed as the difference between the parameters estimated with and
without this observation. This idea cannot be generalized to time series, in which the
deletion of one observation changes the dynamics of the sample. However, it is well known
that in the regression situation the deletion procedure is equivalent to treating the
observation as a missing value and estimating the model:

$$Y = X\beta + \xi^{(T)} w + U \qquad (5.1)$$

where $Y$ is the vector of responses, $X$ the matrix of explicative variables, $\beta$ the
vector of parameters, $\xi^{(T)}$ is a dummy or intervention variable (as defined in section 2)
and $U$ is the vector of perturbations. Then, the vector of estimated parameters from
(5.1) does not depend on the $T$th observation $(y_T, x_T')$.
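The equivalence between deleting an observation and fitting the dummy variable model (5.1) can be checked with a short example. The sketch below assumes, for simplicity, a regression through the origin with a single explicative variable; the function names `ols_with_dummy` and `ols_deleting` and the data are hypothetical.

```python
def ols_with_dummy(x, y, T):
    """Fit y = beta * x + w * dummy_T + u by solving the 2x2 normal equations.

    Normal equations:  [sxx   x_T] [beta]   [sxy]
                       [x_T    1 ] [ w  ] = [y_T]
    """
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    det = sxx - x[T] ** 2
    beta = (sxy - x[T] * y[T]) / det
    w = (sxx * y[T] - x[T] * sxy) / det
    return beta, w

def ols_deleting(x, y, T):
    """Fit y = beta * x + u after deleting observation T."""
    xs = x[:T] + x[T + 1:]
    ys = y[:T] + y[T + 1:]
    return sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)

x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 9.0, 4.2]          # the third response is atypical
beta_dummy, w_hat = ols_with_dummy(x, y, T=2)
beta_deleted = ols_deleting(x, y, T=2)
```

Both routes give the same slope, and the dummy coefficient equals the prediction error $y_T - \hat\beta x_T$ of the deleted observation, which is why the fitted parameters do not depend on that observation.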

This "missing value" procedure can be extended in a straightforward way to any
dynamic situation, because it leads to substituting the observation under investigation by
its forecast using all the sample, instead of deleting it. Of course, in the regression
case of independent observations both procedures are equivalent, but they are not in the
time series context. The key point of the approach is to obtain an estimator of the
parameters that does not depend on the data under investigation.

This is the approach used in the additive outlier case, and it is easy to prove that,
in contrast with the innovational outlier model, the parameter estimates obtained for the
additive outlier case do not depend on the observation in question. Note that $\hat\pi$ is
given by (3.1) and, as $\hat y_T = \hat z_{T/n}$, which does not depend on the value $z_T$, neither $\hat X_y$ nor
$\hat Y$ depends on $z_T$.

5.2.  Some simulation results

To illustrate the behavior of these statistics in practice, figure 1 shows 50
simulated values of an AR(1) time series with $\phi = .7$ in which an additive outlier equal
to 4 standard deviations of the series has been added in position 33. The estimated
parameter value drops from .7 to .58 and figure 1 displays the behavior of $D_1$ and
$D_2$.

The maximum value of the $D_1$ statistic occurs at observation 34, instead of at
observation 33 as could be expected. To understand why, let us look at the AR(1) process
as a regression equation, $z_T = \phi x_T + a_T$, with $x_T = z_{T-1}$. Then, the first component of
$D_1$ is the square of the standardized residual

$$\frac{e_T^2}{\hat\sigma^2(1 - d_T)}$$

and the second, $d_T(1 - d_T)^{-1}$, is a measure of the distance of $x_T$ to the center of gravity
of the previous observations. As here $x_T = z_{T-1}$, if $z_T$ is anomalous because of the
presence of an additive outlier, then $x_{T+1} = z_T$ will be far away from the rest, and the
term $d_{T+1}(1 - d_{T+1})^{-1}$ will be very big and can dominate the product, as happens here.

The statistic $D_2$, however, shows clearly the 33rd observation as atypical. Note
that observations $z_{T-1}$ and $z_{T+1}$ are shown as influential too, although less than $z_T$,
as expected. In general, if the process is AR($p$), the influential effect of an additive
outlier at $T$ appears, although with decreasing effect, on observations $z_{T-p}$ to $z_{T+p}$
as well. This effect is symmetric around $z_T$. For instance in the AR(1) case, it is
easily seen that if there is an additive outlier $w$ at time $T$,

$$E[\hat w_{0,T-1} \mid \hat\phi_0] = E[\hat w_{0,T+1} \mid \hat\phi_0] = -\frac{w\,\hat\phi_0}{1 + \hat\phi_0^2}$$

This effect can be noticed in the plot of the $\hat w_0$ values displayed in figure 1.
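The symmetry can be worked out explicitly in the AR(1) case. The derivation below is a sketch under two assumptions that are not spelled out in the surviving text: the parameter $\phi$ is treated as known, and the outlier size at each point is estimated by the two-sided interpolation residual $\hat w_{0,t} = z_t - \frac{\phi}{1+\phi^2}(z_{t+1} + z_{t-1})$ of section 3.2. The uncontaminated series $y_t$ has zero mean and $z_t = y_t$ except for $z_T = y_T + w$.

```latex
\begin{aligned}
\hat w_{0,T-1} &= z_{T-1} - \frac{\phi}{1+\phi^2}\,(z_T + z_{T-2})
               = \Bigl[\,y_{T-1} - \frac{\phi}{1+\phi^2}\,(y_T + y_{T-2})\Bigr] - \frac{\phi}{1+\phi^2}\,w, \\
\hat w_{0,T+1} &= z_{T+1} - \frac{\phi}{1+\phi^2}\,(z_{T+2} + z_T)
               = \Bigl[\,y_{T+1} - \frac{\phi}{1+\phi^2}\,(y_{T+2} + y_T)\Bigr] - \frac{\phi}{1+\phi^2}\,w, \\
E[\hat w_{0,T-1}] &= E[\hat w_{0,T+1}] = -\frac{\phi}{1+\phi^2}\,w .
\end{aligned}
```

The bracketed terms have zero expectation, so an outlier of size $w$ at $T$ is expected to show up with the opposite sign and reduced magnitude at both neighbouring points, by the same amount on each side.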

Figure 2 shows a second simulation of the same process, using different random
numbers. However, the outlier, although of the same magnitude ($4\sigma$) as before, has now
been added at observation 30. The estimated parameter turns out to be .4335, which
means that the outlier is more influential in this second case. The plot of $D_1$ suggests
that the 30th and 31st observations are both influential, and roughly by the same amount.
However, the statistic $D_2$ clearly indicates the 30th observation as the most influential.

Table  1  displays  some  relevant  values  of  both  statistics  in  these  two  simulations. 

The larger value of $D_2(T)$ in the second simulation shows that the decrease in the
estimated value is greater, as we have seen. In both cases $D_1(T)$ shows observation
$T + 1$ as the most influential, but other simulations have shown that this result does not
hold in general.

In summary, for additive outliers the statistic $D_2$ seems to have a very stable behavior
and correctly identifies the outlier value as more or less influential. On the other hand,
the behavior of $D_1$ is not so consistent, which makes its interpretation and use less
reliable.

The better behavior of D2 in the case of additive outliers is not surprising,
because this statistic has been built precisely to measure these effects. However, the
simulations we have made seem to indicate that its behavior is very consistent for
innovational outliers as well. For instance, figure 3a shows the result of simulating 50
observations from the model:

$$ z_t = \frac{10}{1 - .9B}\,\xi_t^{(40)} + \frac{a_t}{1 - .9B} $$

where $\xi_t^{(40)} = 0$, $t \neq 40$, and $\xi_{40}^{(40)} = 1$. The estimated parameter when fitting an AR(1)
is $\hat{\phi} = .89$, and so in this case the effect of the outlier in the parameter estimate is
very small. The maximum value of $D_2(T)$ is .04 for observation 40. However the
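A minimal Python sketch of this kind of experiment is given below. It assumes standard normal innovations and takes the innovational outlier at observation 40 to have size 10. The function names and the seed are hypothetical, and the estimator is a plain conditional least-squares fit, which need not be the estimation method used for the figures in the paper.

```python
import numpy as np

def simulate_ar1_io(n, phi, omega, T, seed, burn=200):
    """Simulate n points of an AR(1) whose innovation at time T (1-based) is shifted by omega."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(n + burn)      # assumed N(0, 1) innovations
    a[burn + T - 1] += omega               # the outlier enters through the innovation a_T
    z = np.zeros(n + burn)
    for t in range(1, n + burn):
        z[t] = phi * z[t - 1] + a[t]
    return z[burn:]

def ls_ar1(z):
    """Conditional least-squares estimate of the AR(1) parameter."""
    return float(np.dot(z[1:], z[:-1]) / np.dot(z[:-1], z[:-1]))

phi, omega, T, n = 0.9, 10.0, 40, 50
clean = simulate_ar1_io(n, phi, 0.0, T, seed=1)
with_io = simulate_ar1_io(n, phi, omega, T, seed=1)

# With the same seed, the two series differ only by the outlier effect,
# which propagates as omega * phi**(t - T) for t >= T.
effect = with_io - clean
print("estimate without outlier:", round(ls_ar1(clean), 3))
print("estimate with innovational outlier:", round(ls_ar1(with_io), 3))
print("effect at t = 40, 41, 42:", np.round(effect[T - 1:T + 2], 3))
```

Because the disturbance decays with the same autoregressive dynamics as the series itself, the contaminated observations remain compatible with the AR(1) structure.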
6. An Application


The data that will be used to illustrate the use of the previous statistics is the
series of extinctions of marine animals over the past 250 million years and is displayed in
figure 4. These data have been studied by Raup and Sepkoski (1984) and show periodicity in
the peaks of extinctions that they attribute to deterministic extraterrestrial causes.
Kitchell and Peña (1984) show that the observed pseudo-periodicity can be explained by a
fifth order nonstationary autoregressive process with one root equal to 1 and four
other complex roots.

Table 2 displays the original death rate series (D), the residuals of the best
estimated model (E) and the values of the D1 and D2 statistics. Both of those
influence statistics are plotted in figure 4.

Again D1 fails to indicate clearly the influential points and shows peaks in the
32nd and 34th observations. However, D2 pinpoints without any doubt observation 30. The
atypical value of this observation is clear from figure 3 and the residual at this point is
outstanding and bigger than 3 standard deviations. However, the small value of D2 for
this point (.389) suggests that this observation is not very influential as far as the
parameter values are concerned. So, although there are only 39 observations, the
autoregressive model is very robust to the effect of a single outlier.

Table 3a presents the estimated autoregressive model with and without outliers. As
the data are proportions, different transformations have been used to test the sensitivity of
the conclusion to the metric of the data. Table 3b presents the results with the logit
transformation $y_t = \ln \frac{z_t}{1 - z_t}$. It can be seen that the results are very similar, and the
same holds for other possible transformations that have been applied.
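As a small illustration, the logit transformation can be written in Python as follows. The sketch assumes the rates are expressed in percent, as in series D of Table 2, so that the denominator is 100 - z rather than 1 - z. The function name is hypothetical.

```python
import math

def logit_percent(z):
    """Logit of a rate given in percent: ln(z / (100 - z))."""
    if not 0.0 < z < 100.0:
        raise ValueError("rate must lie strictly between 0 and 100")
    return math.log(z / (100.0 - z))

# First five values of series D in Table 2.
rates = [52.5, 21.0, 24.0, 12.8, 15.9]
y = [logit_percent(z) for z in rates]
for z, v in zip(rates, y):
    print(f"z = {z:6.1f}  ->  y = {v:7.3f}")
```

The transformation maps the interval (0, 100) onto the whole real line, which is why it is a natural metric for checking a model fitted to proportions.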


Table 2

Original series (D), estimated residual from the AR(5) model (E), and
statistics D1(T) and D2(T).

ROW          D          E      D1(T)      D2(T)

  1     52.500        ***       .000       .000
  2     21.000        ***       .000       .000
  3     24.000        ***       .000       .000
  4     12.800        ***       .000       .000
  5     15.900        ***       .000       .000
  6     26.400      -.142       .002       .005
  7     38.600       .703       .005       .051
  8     15.900      -.643       .011       .030
  9      2.600     -1.867       .104       .044
 10     10.100       .140       .001       .000
 11     15.200      -.306       .004       .021
 12      7.100     -1.429       .097       .047
 13     11.600       .361       .003       .004
 14      3.500      -.567       .006       .003
 15      7.600       .014       .000       .000
 16      6.000      -.527       .005       .006
 17      9.800       .289       .002       .028
 18     19.500       .949       .016       .012
 19      3.800      -.823       .014       .047
 20      3.600      -.479       .005       .010
 21      9.500       .650       .011       .000
 22      6.000      -.679       .010       .003
 23     10.200       .087       .000       .008
 24     12.000       .859       .016       .051
 25     18.900      1.111       .021       .000
 26      9.900      -.167       .000       .047
 27      5.800      -.374       .001       .031
 28      9.200       .107       .000       .001
 29     14.700       .257       .001       .087
 30     66.300      2.384       .056       .381
 31     22.200      -.054       .001       .011
 32     21.900       .760       .216       .022
 33     11.100      -.230       .020       .019
 34     36.700       .851       .250       .033
 35     45.800       .031       .000       .000
 36     29.400      -.080       .000       .000
 37     20.000      -.114       .000       .000
 38     12.500      -.417       .000       .000
 39     25.000      -.076       .000       .000
Table 3

MODEL                                                                                  Q(g)      $\hat{\sigma}_a^2$

a)  $(1 + .66B + .56B^2 + .71B^3 + .38B^4)\nabla z_t = a_t$                             6.2       122.8
         (4.2)    (3.8)      (4.6)      (2.56)

    $z_t = 42.68\, I_t^{(30)} + a_t / [\nabla(1 + .32B + .56B^2 + .45B^3 + .22B^4)]$    4.1       63.65
           (6.71)                             (1.94)   (3.75)     (2.71)     (1.55)

b)  $(1 + .62B + .59B^2 + .62B^3 + .42B^4)\nabla y_t = a_t$                             7.0       .574
         (4.0)    (3.8)      (3.9)      (2.7)

    $y_t = 2.05\, I_t^{(30)} + a_t / [\nabla(1 + .49B + .65B^2 + .43B^3 + .40B^4)]$     [illegible]   .440
           [illegible]                        (3.1)    (4.2)      (2.5)      (2.7)

$z_t$ is the extinction rate series (series D in table 2) and $y_t$ is its logit
transformation $y_t = \ln[z_t / (100 - z_t)]$, B is the backshift operator, $\nabla = 1 - B$, $I_t^{(30)}$ is an
indicator variable with $I_{30}^{(30)} = 1$ and $I_i^{(30)} = 0$ $\forall i \neq 30$, Q(g) is the Ljung-Box
statistic with g degrees of freedom and $\hat{\sigma}_a^2$ is the residual variance of the model.
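The structure of the two models in Table 3a can be sketched with conditional least squares in Python, using series D of Table 2. This is only an illustration under stated assumptions: the function names are hypothetical, the intervention model is fitted by a simple alternating least-squares loop, and the paper's own estimation method is not specified here, so the numbers produced need not reproduce those in the table.

```python
import numpy as np

# Series D of Table 2 (extinction rates, in percent).
D = np.array([52.5, 21.0, 24.0, 12.8, 15.9, 26.4, 38.6, 15.9, 2.6, 10.1,
              15.2, 7.1, 11.6, 3.5, 7.6, 6.0, 9.8, 19.5, 3.8, 3.6,
              9.5, 6.0, 10.2, 12.0, 18.9, 9.9, 5.8, 9.2, 14.7, 66.3,
              22.2, 21.9, 11.1, 36.7, 45.8, 29.4, 20.0, 12.5, 25.0])

def lagged(w, p):
    """Return the target w_t and the matrix of lags 1..p for an AR(p) regression."""
    y = w[p:]
    X = np.column_stack([w[p - k:len(w) - k] for k in range(1, p + 1)])
    return y, X

def fit_ar_diff(z, p=4):
    """Fit (1 + c1 B + ... + cp B^p)(1 - B) z_t = a_t by conditional least squares."""
    y, X = lagged(np.diff(z), p)
    c, *_ = np.linalg.lstsq(-X, y, rcond=None)   # sign follows the (1 + cB) convention
    return c, y + X @ c                          # coefficients and residuals

def filter_ar(w, c):
    """Apply (1 + c1 B + ... + cp B^p) to w, dropping the first p values."""
    y, X = lagged(w, len(c))
    return y + X @ c

def fit_with_intervention(z, T, p=4, n_iter=50):
    """Alternate between the AR coefficients and the outlier size omega at time T."""
    ind = np.zeros(len(z))
    ind[T - 1] = 1.0                             # indicator of observation T (1-based)
    omega = 0.0
    for _ in range(n_iter):
        c, _ = fit_ar_diff(z - omega * ind, p)
        u = filter_ar(np.diff(z), c)             # filtered series
        v = filter_ar(np.diff(ind), c)           # filtered indicator
        omega = float(u @ v / (v @ v))           # least-squares outlier size given c
    c, resid = fit_ar_diff(z - omega * ind, p)
    return omega, c, resid

c0, r0 = fit_ar_diff(D)
omega, c1, r1 = fit_with_intervention(D, T=30)
print("without intervention:", np.round(c0, 2), "residual variance", round(float(r0 @ r0) / len(r0), 1))
print("with intervention:   ", np.round(c1, 2), "residual variance", round(float(r1 @ r1) / len(r1), 1))
print("estimated outlier size at observation 30:", round(omega, 2))
```

Each step of the alternating loop can only lower the residual sum of squares, so the intervention fit is never worse than the plain autoregressive fit on this criterion.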